Oral Proficiency Testing in
an Intensive English Language Program


Debby Hendricks, George Scholz, Randon Spurling,
Marianne Johnson, and Lela Vandenburg


Several oral testing techniques are investigated. A variant of the FSI oral
interview technique is compared against three pragmatic testing methods
that are somewhat simpler to apply. All the oral testing techniques studied
are correlated with the CESL Placement battery. Reliability indices for the
various subscales on the FSI Oral Interview are reported. Scores on the
various oral testing techniques (twenty-seven separate scores in all) are
factor analyzed to see if oral language proficiency can be broken down into
component parts. The results seem to support a single-factor solution. A
multiple-factor alternative is examined but discarded as probably
unreliable. In both factor solutions, the FSI scales of Accent, Grammar,
Vocabulary, Fluency, and Comprehension all load on a common factor,
suggesting that they are unitary. There is no evidence to support the claim
that separate aspects of oral proficiency can be clearly distinguished.


The need for developing efficient valid methods of testing oral proficiency
has long been recognized. The most obvious approach to oral testing, and the one
presumed to be most valid, is the oral interview. An interview provides a very
direct method of challenging someone to speak, and it offers a realistic setting in
which to assess overall oral mastery of a particular language. Alternate methods,
especially less direct approaches, have been criticized for not really putting
someone's language ability to the test. However, this may be more the fault of the
influence of the typical discrete point orientation of language testers in recent years
tiem a ies ditfien tts i d 

The diserete poin 
lestialize } 


ir dee pene bersihin dre Cres ete es 


sting fails ta provide either a von- 


Vive idee cena oP las 


anproiweh to language te; 
Fatal. 


suey amit seins iva 


III: SPEAKING TASKS


Spolsky et al. (1972) have indicated that discrete point tests may either be based
on a sample of some inventory of linguistic elements of a particular language, or be
derived from some notion of functional needs. However, if we look beyond the
level of words, the total number of items in any language is not finite or well
defined; therefore, sampling techniques can scarcely be applied at all, and never
very systematically. Further, a selection based on some idea of functional needs
can never be quite certain that a specific element or set of elements is in fact
really necessary. The natural redundancy of language allows one to use it without
having fully mastered [illegible] language. In fact, the great [illegible] nature
of language makes it impossible for anyone to "know" all the [illegible] in the
sense of separate items of knowledge. Thus, neither of the commonly used methods
of discrete point item selection seems likely to provide an adequate assessment
of a person's ability to function in a real [illegible] language setting.


To date, the most widely used technique for evaluating oral proficiency is the
Foreign Service Institute (FSI) Oral Interview (Spolsky et al., 1972; Wilds, 1975;
and Oller, 1979). The FSI interview has high reliability, and because it takes place
in a more or less natural setting, it is believed to be a good determinant of a person's
true language competence. Its major drawback is that it is time-consuming and
expensive to administer and score. Altogether, about a half hour is required on the
average for administering and scoring each interview (Wilds, 1975).

One alternative to oral interviewing would be to develop other pragmatic tests
of oral proficiency which (though they may be less direct than the oral interview)
are not tied to discrete point methods of test construction, scoring, and
interpretation. It has been suggested that reduced redundancy tests, such as cloze
tests, may provide valid replacements for the oral interview (Clark, 1973). Oller and
Hinofotis (Chap. 1) found correlations ranging from .31 to .62 between a cloze test
and the various subscales of the FSI oral interview. This encourages us to believe
that some type of cloze test (written or oral) might be refined to provide some of the
information available through the more expensive and time-consuming oral
interview techniques.

This chapter evaluates four possible procedures for testing oral proficiency.
Three pragmatic speaking tasks, repetition (elicited imitation), oral cloze, and
reading aloud, are used along with a traditional FSI-type oral interview. The
interview technique used here, however, does not fully meet FSI requirements; in
particular, the teams of examiners did not include a trained linguist. Results show
that this fact apparently did not reduce reliability, but for this reason we refer to
the technique here as an FSI type of oral interview. The results of all four
proficiency tests were intercorrelated and examined in relation to the factor results of
Scholz et al. (Chap. 2). The experimental tests were also correlated with the scores
of the Center for English as a Second Language (CESL) placement test consisting
iE teste 1.1 aml 2b deseribed iin. 2 


In addition to the factoring already reported in Chap. 4, further factoring was
done (both a principal component solution and a varimax rotation) to determine
which of the specifically oral tasks and which scores based on those tasks
generated the most meaningful variance in relation to the other oral tasks.
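
As a rough illustration of the principal component step mentioned above, the following Python sketch extracts the first principal component (the general factor, g) from a correlation matrix by power iteration. The matrix `ILLUSTRATIVE_CORR` and the function `first_principal_factor` are hypothetical names invented for this example; the values are not data from the study, the code is not the software the authors used, and the varimax rotation is not shown.

```python
import math

# Hypothetical 3 x 3 correlation matrix in which every pair of scores
# correlates .50. These are NOT values from the study.
ILLUSTRATIVE_CORR = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]


def first_principal_factor(corr, iterations=200):
    """Return (eigenvalue, loadings) for the first principal component.

    Power iteration from an all-positive start vector; this is adequate
    for a correlation matrix whose entries are all positive.
    """
    n = len(corr)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector.
    product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(vector[i] * product[i] for i in range(n))
    # A loading is the correlation of each score with the component.
    loadings = [math.sqrt(eigenvalue) * x for x in vector]
    return eigenvalue, loadings


eigenvalue, loadings = first_principal_factor(ILLUSTRATIVE_CORR)
# Share of total variance: eigenvalue divided by the number of scores.
share = eigenvalue / len(ILLUSTRATIVE_CORR)
```

In this toy case each score loads equally on the single component; with real data the loadings would differ from score to score, which is what the factor tables in this chapter report.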


Experiment 


Subjects. Seventy of the 182 students enrolled at all levels of proficiency at the
Center for English as a Second Language at Southern Illinois University at
Carbondale served as subjects. Since all 182 CESL students were invited to
participate, the ones who actually elected to do so were probably not completely
representative of the whole group. In fact they tended to be students who were
more confident about their skill in English. Nevertheless, their overall level of
proficiency was not particularly high on the FSI scales. The mean for Accent was
1.68 (out of a possible 4 points); for Grammar it was 12.69 (possible, 36); for
Vocabulary it was 8.98 (possible, 24); for Fluency, 5.58 (12); and for
Comprehension, 10.38 (23). Thus on the scale in Appendix 7C this group would get a mean
proficiency rating of 1+ (based on a score of 39.31), and according to Appendix
7A, which gives rough verbal descriptions of the five proficiency levels, they
would be able to do little more than fulfill minimum travel needs.
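
The arithmetic behind the mean rating can be checked with a short sketch. It assumes that the Fluency and Comprehension means are 5.58 and 10.38; the printed digits for those two values are damaged in the source, and these readings are chosen because they respect the scale maxima (12 and 23) and sum with the other three means to the reported total of 39.31. The variable names are illustrative only.

```python
# Mean weighted subscale scores for the subjects.
# ASSUMPTION: Fluency = 5.58 and Comprehension = 10.38 (see lead-in).
mean_scores = {
    "Accent": 1.68,
    "Grammar": 12.69,
    "Vocabulary": 8.98,
    "Fluency": 5.58,
    "Comprehension": 10.38,
}

# Maximum possible weighted points on each subscale, as given in the text.
max_scores = {
    "Accent": 4,
    "Grammar": 36,
    "Vocabulary": 24,
    "Fluency": 12,
    "Comprehension": 23,
}

total = round(sum(mean_scores.values()), 2)

# The conversion table in Appendix 7C assigns totals of 33-42 to level 1+.
in_level_1_plus_band = 33 <= total <= 42
```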


Test Materials. The CESL Placement Test comprised of three parts
described in Chap. 2 was used for some of the correlations reported below.
Subtests were aimed at Listening Comprehension, Structure, and Reading. The
five proficiency levels of the FSI Oral Interview are defined roughly in Appendix
7A (taken from the Manual for Peace Corps Language Testers prepared by
Educational Testing Service). The five subscales are similarly described in Appendix
7B. Finally, the recommended weighting and conversion tables for interpreting
scores on the five scales to assign a proficiency level rating are given as Appendix
7C.


In cases where the subject had very little or no command of the language,
interview time was about 15 minutes or even less, but normally the interviews
lasted between 20 and 30 minutes. Each exchange was tape-recorded and
subjects were rated by two interviewers. In the weightings assigned to the various
subscales, Grammar received the heaviest weight, followed by Vocabulary,
Comprehension, Fluency, and Accent, respectively, as shown in Appendix 7C.


Interviewers. Prospective FSI interviewers normally must participate in a
program which for Peace Corps Language Testers required four to five
days (ETS, 1970). On the surface, the FSI Oral Interview appears to be a normal
everyday conversation, but it is supposed to be "a specialized procedure which
uses the relatively brief testing period to explore many aspects of the student's
language competence in order to place him in one of the categories (levels)
described" (ETS, 1970, p. 40). In order to reasonably duplicate the FSI
requirements, two teams of two interviewers each were made up from a pool of
[illegible] three graduate assistants at CESL. Before collection of the speech
data, all interviewers read the Manual for Peace Corps Language Testers (ETS,
1970) and listened to the fifteen training tapes of sample FSI Oral Interviews
[illegible] with the manual. A list of possible topics was given to each
[illegible] (see Appendix 7D).


Pragmatic Speaking Tests. All three of the pragmatic tests, repetition, oral
cloze, and reading aloud, contained texts intended to range in difficulty from easy
to difficult. All passages were approximately seventy words in length. Each easy
text was selected from a 4th to 5th grade reader; the intermediate texts were taken
from a junior high school reader; and the difficult passages were excerpted from a
[illegible] grade reader. The texts of all three pragmatic tasks are given as entries 11, 12,
and 13 in the Appendix at the end of this volume.


Procedure. Subjects were interviewed within a five-week period in the spring
semester. Oral interviews were scheduled during the subject's free time. During the
experiment all interviews were tape-recorded. To establish interrater reliability,
twelve subjects scored by each interviewing team were rated again by the
other team, producing twenty-four interviews rated by both teams.
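
The interrater check described above amounts to correlating the two teams' ratings of the same interviews. The sketch below assumes that the reliability index is an ordinary Pearson product-moment correlation, which the chapter does not state explicitly; the two rating lists are hypothetical and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(variation_x * variation_y)


# Hypothetical total scores given to the same six interviews by two teams.
team_1 = [30, 42, 55, 61, 38, 47]
team_2 = [33, 40, 58, 59, 41, 45]

interrater_r = pearson_r(team_1, team_2)
```

A coefficient near 1 would indicate that the two teams rank and space the subjects in nearly the same way, even if one team is systematically more lenient than the other.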

The three pragmatic speaking tests (repetition, oral cloze, and reading aloud) were
administered at the CESL/SIU language laboratory during the same six-week
period as the oral interviews. (For detailed instructions provided to the subjects,
see Appendix entries 11, 12, and 13.) For the pragmatic tasks each subject was
individually tape-recorded. Actual testing time was approximately 40 minutes, with
an additional 15 minutes for seating and laboratory adjustments.

Scoring. The oral interviews were scored by the normal FSI procedures. Both
the repetition and oral cloze tests were scored by exact and acceptable word
methods. The reading aloud tests were scored in three ways: a first score was
obtained by counting exact word renditions (one point for each word).
Mispronunciations were counted as errors in all cases. A second score was derived by
counting appropriate words that were included in subjects' responses but not in the
text; and a third score was simply the amount of time (in seconds) required
for each reading. To obtain an overall score for the reading aloud tasks, the number
of correct words in each passage (exact plus acceptable) was divided by the number
of seconds required to complete each passage. (This was the score used by Scholz
et al. in the factor analyses of Tables 2-1 and 2-4.)
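
The overall Reading Aloud score defined in the Scoring paragraph is a simple rate: correct words (exact plus acceptable) per second. A minimal sketch follows; the function name and the example counts are hypothetical and do not come from the study.

```python
def reading_aloud_overall(exact_words, acceptable_words, seconds):
    """Overall reading aloud score: (exact + acceptable) words per second."""
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    return (exact_words + acceptable_words) / seconds


# Hypothetical subject: 63 exact words, 2 acceptable words, 26 seconds.
example_score = reading_aloud_overall(63, 2, 26)
```

Because the Acceptable count enters the numerator, a subject who reads the passage verbatim gains nothing from it, a point the factor results later in the chapter return to.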


Results and Discussion 


As Table 7-1 indicates, the correlation between the two interviewing teams for the
overall interview score was .90. The only low correlation between the ratings of
the teams was for the Accent subscale (.13). However, the high intercorrelations
on the other scales show that the two interview teams agreed substantially in their
assessments of oral proficiency.
Table 7-2 indicates that the pragmatic task which correlates most highly with
the overall FSI score is Repetition (.70); Oral Cloze follows (.63), and then the
overall Reading Aloud score (.51). Each of these correlations is significant at the
.01 level. The Repetition task accounts for 49% of the variance in the overall FSI
rating, the Oral Cloze has a common variance of 40% with the overall FSI score,
and the Reading Aloud 26%.

Hendricks et al.: Oral proficiency testing

Table 7-1  Interrater Reliability for the Subscales of the FSI Oral Interview
(N = 24)

Measures (Group 1 with Group 2)              r

1  Oral Interview-Accent                [illegible]
2  Oral Interview-Grammar                  .83
3  Oral Interview-Vocabulary               .85
4  Oral Interview-Fluency               [illegible]
5  Oral Interview-Comprehension            .89
6  Oral Interview-Total Score              .91
7  Converted Total Score                   .85


These facts can probably best be accounted for on the basis of the respective
reliabilities of the various measures. Repetition appears to be a promising measure
of oral proficiency. It correlates significantly (p < .001) with all the subscales of
the Oral Interview, and with the other pragmatic tasks. In addition, Repetition may
be useful as a diagnostic test. Subjects tended to make consistent grammatical
errors which could provide invaluable diagnostic data in defining the learner's
interlanguage system (cf. Selinker, 1972). Oral Cloze also looks promising.
However, we believe its utility could possibly be enhanced by altering the format
to require phrases expressing the next idea in the text rather than single words as in the
written cloze formats. Reading Aloud is the least effective measure, but this may


Tie 


a 


Ei 


керы acy Tests 

74 1 Oral Interview—Accent 1.50 de 233] дїй ck co tee S 
Oral interview -Grammar 89 .23 .53 181 .44 .32 .06 
Oral Interview Vocabulary 1.00 77 84 93 156 45 + 
Oral Interview ~Fluency : 100 8% .67 .66 .73 91 
Oral Interview -Comprehension 1.00 .86 .75 .75 79 
Oral Interview (Overall Rating) 100  .70  .63  .5! 
Repetition 1.00 .80  .62 
Oral Claze * OU oF 
Reading Mout! 1.90 


p < .05 for all correlations above .25.


ax 




Four Ora! 
t Battery 


GESI СЕЅІ CESL 
Вел Structure Listening Totat 


cy Tests 


Oral taterview- Accent 34 19 34 

Oral Interview Grammar 47 43 EE 

Ora! Interview - Vor. tuiary - 42 En .51 
Intervie ъ= Fiucncy Й .08 07 
Interview--Ce mprehensio s 5 25 


Orsli alt Ratin’ 3 Ec] 37 
Repetition i 25 35 
Oral Cloze 3 39 39 
Reading Aloud 1 03 % 


<.05 for all correlations above 


be due to the scoring method used. We return to the latter problem below in the
discussion of the factor analyses in Tables 7-4 and 7-5. There is a hint in Table 7-2
that the Reading Aloud task measures Fluency (r = .91), Comprehension (r = .79),
and to a lesser extent Vocabulary (r = .44) more than whatever is measured by the
other scales. However, as we will see below, this interpretation does not fit the
factor results.

Table 7-3 shows the relationship between the oral proficiency tests and the
CESL Placement battery. The CESL battery consistently correlates best with the
Oral Interview Grammar and Vocabulary scores. They share a maximum common
variance of about 28%. On the whole, the Oral Interview is roughly equivalent to
Repetition and Oral Cloze as a predictor of the CESL Placement battery.
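
The "common variance" figures quoted in this section are squared correlations expressed as percentages. The sketch below assumes the correlations with the overall FSI rating are .70, .63, and .51 for Repetition, Oral Cloze, and Reading Aloud; some of the printed digits are damaged in the source, and these readings are the ones consistent with the reported 49%, 40%, and 26%. The function name is illustrative.

```python
import math


def shared_variance_percent(r):
    """Percentage of variance two measures share, given their correlation r."""
    return round(100 * r * r)


# ASSUMED correlations with the overall FSI rating (see lead-in).
correlations = {"Repetition": 0.70, "Oral Cloze": 0.63, "Reading Aloud": 0.51}
shared = {task: shared_variance_percent(r) for task, r in correlations.items()}

# Working backwards: a common variance of about 28% corresponds to a
# correlation of roughly .53.
r_for_28_percent = round(math.sqrt(0.28), 2)
```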

The factor analysis results reported in Table 7-4 are excerpted from Oller
(1979, Appendix Table 5). Here it should be remembered that only the FSI scales
are really full-blown testing procedures. The other scores included are actually
subscores from the Repetition, Oral Cloze, and Reading Aloud tasks defined
above. Nevertheless, in spite of the reduced reliability that should be expected
when scores are based on such shortened subtests, the loadings on a common
principal factor (g) are nearly all significant at the .01 level. Only five of the
subscores included fail to load significantly on g. At least one of the subscores in each
task accounts for .45 or more of the variance in the common factor.

The fact that the Acceptable word scores for Repetition tasks A, B, and C load
inconsistently (as well as lightly or insignificantly) on g is easily understood. If
subjects rendered the texts correctly by the exact word criterion (i.e., a verbatim
repetition) there was little likelihood of their getting any additional points for
acceptable words not in the original text. The same explanation applies even more
dramatically to the Reading Aloud Acceptable scores, which all tend to load
negatively on g. This is due to the fact that the more the subject tends to render the texts
Table 7-4  Principal Components Analysis over Twenty-seven
Speaking Scores (N = 64)
Scores*                              Loadings on g    Squared loadings


Oral Interviewe Accent 
Oral'tatervicw-- Grammar 
Oral Intervie ы V'ocabul 
Oral Intérvlew--Fiuency 
Oral Interview—Comprchension 
FS] Oral Proficiency Le 


Repetition [xate A Ж 
Report 

Repetition 
Repetition E 
Repetition Exact © 339 .15 
Repetition Acceptable C. 0 


. Oral Cloze Exact A 54 
е Ога! Cloze Acceptable А .66 
Oral Cloze Exact В $9 

` Oral Cloze Acceptable B ESI 
Oral Ctoze Exact C .53 

Ога Cloze Acceptable C 5; 


- Reading Aloud Time А. - Ё ا‎ 
‘Reading Aloud Exact-A 

Reading Aloud Acceptable A 

Reading Aloud Time B - 
Reading Aloud Exact B 

Reading Aloud Acceptable В 
Reading Aloud Time C 

„ Reading Aloud Exact C. Е 

^ Reading Aloud Acceptable С 


эф us oF 


оа гол 


Eigenvalue = 9.39


*Loadings above .33 are significant at p < .01, those above .25 at p < .05;
in this case and in Table 7-5, the deletion of missing cases was listwise (Nie
et al., 1975), and therefore the subject population was the same for each
and every test score.


exactly, the less opportunity they have for creative innovations that would gain
points under the Acceptable word scoring method. The Oral Cloze tasks, however,
were sufficiently challenging that Acceptable scores account for about as much of
the variance in g as do the Exact scores. As would be expected, the Reading Aloud
Time scores are negatively related to the global oral-proficiency factor. The more
proficient subjects required less time: the longer the time, the lower the
proficiency.
Moreover, by studying the loadings on g for the Reading Aloud scores, it is
apparent that by including the Acceptable scores in the
overall Reading Aloud scores, we necessarily depressed the effectiveness of Reading
Table 7-5  Varimax Rotated Solution over Twenty-seven Speaking Scores
(N = 64)
Factors Qe SR oue OO AW ср зр уй 
Otei > - 
z interview -Areeni + 23. 
Jl Interview = Grammar z 7 
‘al Interview Vocab 81 
3 69. 
7 _ 16 
Oral Proficiency Level АЧ 85 
айоо Exact А " 253 15 
5 73 53. 
18 
"petition Accenr dle à EM 
petition Exact С AD 48 
petition Acceptaba 38 34 
val Cloze Exact А Ad 24 
ral Cloze Acceptable А 25 56 
cat Cloze Exact B hg ES AT 
ral Cloze Acceptable В 56 74 
ral Cloze Exact C та 32 
‘al Cloze Acceptable C 6 33 
sading Aloud Time A -.82 67 
fading Aloud Exact А ا‎ 45 td 
sading Aloud Acceptaoie A 26 
sading Aloud Time В —.84 лі 
габіпр Aloud Exact B .63 40 
rading Aloud Acceptable В Ted 52 
sading Aloud Time C —.82 67 
tading Aloud Exact C AS 56 151 
zading Aloud Acceptable С лэ 53 


Eigenvalue = 15.22


Aloud as a measure of global proficiency. This fact probably explains (at least in
part) the results obtained in Tables 7-2 and 7-3 where the Reading Aloud overall
score (defined as the quantity, Exact plus Acceptable score, divided by Time)
performed less well than Repetition and Oral Cloze in relation to the FSI and
CESL subscores. All the loadings on g in Table 2-1 of Repetition, Oral Cloze, and
Reading Aloud could probably be improved by taking advantage of what can be
learned from Table 7-4 concerning optimum scoring methods.

In Table 7-5 a varimax rotated solution is presented over the same data.
The single-factor solution accounts for only 35% of the total variance in all
the scores, whereas the rotated solution (with seven significant factors) accounts
for an additional 21% (for a total of 56% of the variance). However, the rotated
solution is less parsimonious in interpretation.
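
The variance percentages follow from the eigenvalues printed under Tables 7-4 and 7-5. In a principal components analysis of a correlation matrix, each standardized score contributes one unit of variance, so the proportion of total variance explained is the eigenvalue divided by the number of scores. The short check below uses the two printed eigenvalues (9.39 and 15.22) and the twenty-seven scores; the variable names are illustrative.

```python
NUMBER_OF_SCORES = 27  # twenty-seven speaking scores were factored

single_factor_eigenvalue = 9.39   # printed under Table 7-4
rotated_total_eigenvalue = 15.22  # printed under Table 7-5

single_factor_percent = round(100 * single_factor_eigenvalue / NUMBER_OF_SCORES)
rotated_percent = round(100 * rotated_total_eigenvalue / NUMBER_OF_SCORES)

# Difference between the two rounded percentages.
additional_percent = rotated_percent - single_factor_percent
```

The rounded figures reproduce the 35%, 56%, and 21% quoted in the text.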

All the FSI scores load on Factor 1 (Table 7-5). We might have expected the
Fluency scale to load on a factor common to the Time scores for the Reading Aloud
tasks. However, the Time scores all load by themselves on Factor 4. The
Repetition scores scatter their variances over five of the orthogonal (i.e.,
uncorrelated) factors. In particular, the Exact scores tend to load on Factor 2 along with
the Acceptable scores for the Oral Cloze tasks, while the Acceptable scores for the
Repetition tasks fall out on Factor 6. These clusters of loadings are probably due
more to unreliabilities in the tasks (probably because of their brevity) than to
reliable differences in the nature of the processing skills exercised.

Summary. There is no evidence that the several FSI scales are measuring
different skills or components of oral proficiency. Further, in view of the fact that the
loadings on the varimax rotated factors are scarcely any higher in Table 7-5 than in
Table 2-4 (which offers a single-factor solution for an even wider range of tasks), it
is assumed that the FSI scales are unitary and that they measure the same basic
skill that underlies performance on the three oral pragmatic tasks (especially
Repetition scored by the Exact word method, Oral Cloze scored by the appropriate
word method, and Reading Aloud scored simply in terms of the number of
seconds it takes subjects to complete the task).
Finally, in view of the estimated reliabilities of the various scores, it seems
reasonable to conclude that the unitary factor solution presented in Table 7-4
explains the bulk of the reliable variance in all the tasks considered and that the
additional variance accounted for in the multiple-factor solution shown in Table
7-5 is indeed unreliable variance. The latter claim will require further empirical
substantiation, but in the meantime it seems safe to say that Repetition, Oral
Cloze, and Reading Aloud all offer some promise as substitutes for oral interview
testing.


Appendix 7A
The Five Levels of Overall Proficiency
of the FSI Oral Interview


Level 1

Able to satisfy routine travel needs and minimum courtesy requirements. The
student can answer questions on topics very familiar to him; within the scope of his
very limited language experience he can understand simple questions and
statements, allowing for slowed speech, repetition, or paraphrase. His speaking
vocabulary is inadequate to express anything but the most elementary needs;
errors in pronunciation and grammar are frequent but he can be understood by a
native speaker used to dealing with foreigners attempting to speak his language.


Level 2

Able to satisfy routine social demands and limited work requirements. The
student can handle with confidence but not with facility most social situations
including introductions and casual conversations about current events, as well as
work, family, and autobiographical information; he can handle limited work
requirements, needing help in handling any complications or difficulties. He can
get the gist of most conversations on nontechnical subjects (i.e., topics which
require no specialized knowledge) and has a speaking vocabulary sufficient to
express himself with some circumlocutions. His accent, though often quite faulty,
is intelligible. He can handle elementary constructions quite accurately but does
not have thorough or confident control of the grammar.


Level 3

Able to speak the language with sufficient structural accuracy and
vocabulary to participate effectively in most formal and informal conversations on
practical, social, and professional topics. The student can discuss particular
interests and special fields of competence with reasonable ease. His
comprehension is quite complete for a normal rate of speech. His vocabulary is broad enough
that he rarely has to grope for a word. His accent may be obviously foreign. His
control of grammar is good, and his errors never interfere with understanding and
rarely disturb the native speaker.


Level 4

Able to use the language fluently and accurately on all levels normally
pertinent to professional needs. The student can understand and participate in any
conversation within the range of his experience with a high degree of fluency and
precision of vocabulary. He would rarely be taken for a native speaker, but he can
respond appropriately even in unfamiliar situations. His errors of pronunciation
and grammar are quite rare, and he can handle informal interpreting from and into
the language.


Level 5

Speaking proficiency equivalent to that of an educated native speaker. The
student has complete fluency in the language such that his speech on all levels is
fully accepted by educated native speakers in all its features, including breadth of
vocabulary and idiom, colloquialisms, and pertinent cultural references.

Appendix 7B
Proficiency Descriptions


Accent

1. Pronunciation frequently unintelligible.
2. Frequent gross errors and a very heavy accent make understanding difficult.
3. "Foreign accent" requires concentrated listening, and mispronunciations
   lead to occasional misunderstanding and apparent errors in grammar or
   vocabulary.
4. Marked "foreign accent" and occasional mispronunciations which do not
   interfere with understanding.
5. No conspicuous mispronunciations, but would not be taken for a native
   speaker.
6. Native pronunciation, with no trace of "foreign accent."


Grammar

1. Grammar almost entirely inaccurate [illegible].
2. Constant errors showing control of very few major patterns and frequently
   [illegible].
3. [illegible] uncontrolled and causing [illegible].
4. Occasional errors showing imperfect control of some patterns but no
   weakness that causes misunderstanding.
5. Few errors, with no patterns of failure.
6. No more than two errors during the interview.


Vocabulary

1. Vocabulary inadequate for even the simplest conversation.
2. Vocabulary limited to basic personal and survival areas (time, food,
   transportation, family, etc.).
3. Choice of words sometimes inaccurate; limitations of vocabulary prevent
   discussion of some common professional and social topics.
4. Professional vocabulary adequate to discuss special interests; general
   vocabulary permits discussion of any nontechnical subject with some
   circumlocutions.
5. Professional vocabulary broad and precise; general vocabulary adequate to
   cope with complex practical problems and varied social situations.
6. Vocabulary apparently as accurate and extensive as that of an educated
   native speaker.


Fluency

1. Speech is so halting and fragmentary that conversation is virtually
   impossible.
2. Speech is very slow and uneven except for short or routine sentences.
3. Speech is frequently hesitant and jerky; sentences may be left incomplete.
4. Speech is occasionally hesitant, with some unevenness caused by rephrasing
   and groping for words.
5. Speech is effortless and smooth, but perceptibly nonnative in speed and
   evenness.
6. [illegible] and smooth as a native speaker's.


Comprehension

1. Understands too little for the simplest kind of conversation.
2. Understands only slow, simple speech on common social and touristic
   topics; requires constant repetition and rephrasing.
3. Understands careful, simplified speech directed to him, but requires
   occasional repetition or rephrasing.
4. Understands quite well normal educated speech directed to him, but
   requires occasional repetition or rephrasing.
5. Understands everything in normal educated conversation except for very
   colloquial or low-frequency items, or exceptionally rapid or slurred speech.
6. Understands everything in both formal and colloquial speech to be expected
   of an educated native speaker.


Appendix 7C
FSI Score Weighting Table


Proficiency description     1     2     3     4     5     6

Accent                      0     1     2     2     3     4
Grammar                     6    12    18    24    30    36
Vocabulary                  4     8    12    16    20    24
Fluency                     2     4     6     8    10    12
Comprehension               4     8    12    15    19    23
Total


Conversion Table


Total score                Level      Total score      Level
(from weighting table)
16-25                      0+         63-72            3
26-32                      1          73-82            3+
33-42                      1+         83-92            4
43-52                      2          93-99            4+
53-62                      2+
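
The two tables above can be read as a two-step lookup: each subscale rating (1 to 6) is converted to weighted points, the points are summed, and the total is converted to a proficiency level. The sketch below encodes the tables as reconstructed here from a damaged source, so the individual cell values should be treated as assumptions; the names `WEIGHTS`, `CONVERSION`, `weighted_total`, and `fsi_level` are hypothetical.

```python
# Weighted points for ratings 1-6 on each subscale (ASSUMED values, as
# reconstructed from the weighting table above).
WEIGHTS = {
    "Accent": [0, 1, 2, 2, 3, 4],
    "Grammar": [6, 12, 18, 24, 30, 36],
    "Vocabulary": [4, 8, 12, 16, 20, 24],
    "Fluency": [2, 4, 6, 8, 10, 12],
    "Comprehension": [4, 8, 12, 15, 19, 23],
}

# (lowest total, highest total, level) bands from the conversion table.
CONVERSION = [
    (16, 25, "0+"), (26, 32, "1"), (33, 42, "1+"),
    (43, 52, "2"), (53, 62, "2+"), (63, 72, "3"),
    (73, 82, "3+"), (83, 92, "4"), (93, 99, "4+"),
]


def weighted_total(ratings):
    """Sum the weighted points for a dict of subscale ratings (each 1-6)."""
    return sum(WEIGHTS[scale][ratings[scale] - 1] for scale in WEIGHTS)


def fsi_level(total):
    """Return the level for a whole-number total, or None if no band applies."""
    for low, high, level in CONVERSION:
        if low <= total <= high:
            return level
    return None


# Hypothetical subject rated 3 on every subscale.
example_ratings = {scale: 3 for scale in WEIGHTS}
example_total = weighted_total(example_ratings)
example_level = fsi_level(example_total)
```

The heavy weight on Grammar is visible in the arithmetic: a one-step change in the Grammar rating moves the total by six points, while a one-step change in Accent moves it by at most one.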


Appendix 7D
Interview Topics


Present Tense

Your usual day
Your hobby
Your present job
Your [illegible]
The thing you dislike most
Your home, apartment, or room
Your home town
Saturday afternoon in your home town
A holiday
Your favorite animal
The problems of an only child
Your big (or little) brother (or sister)
Your father's favorite sayings
Why you are in school
Your favorite subject (teacher, classmate) at school
What education means to you
Are examinations necessary?
Life in a college dormitory
Differences between high school and college
The importance of sports in college
Your best friend
Your worst enemy
An interesting person you know
The strangest person you know
Traffic problems
The best kind of vacation
Why you like (don't like) television
What makes a good movie?
Small towns versus large towns


Past Tense

A frightening experience
Your most embarrassing [illegible]
Your biggest [illegible]
What you did last weekend
An [illegible] story you told
Your most interesting trip
Your first long trip
An important event in your life
One time when you were misunderstood
[illegible] you decided to come to this school [illegible]
A folk tale
A movie or play you enjoyed (or didn't enjoy)


Future Tense

The world in the year 2000
What will probably happen in the next 6 [illegible]
Things you intend to do
What you will probably do tomorrow
Your plans for a vacation
Your plans for next weekend (next year)
The plans you have for your children [illegible]


Should or Imperative

How to be a good tourist
Travel tips
How to bake a cake, a pie (recipes and instructions)
Should married women work outside the home?


Conditional

If you had a million dollars
If you were the last person alive


Direct and Indirect Speech

A conversation you had this morning
A conversation you overheard

Chapter 


Rater Reliability 
and Oral Proficiency Evaluations 


Karen A. Mullen


This investigation attempted to determine (1) whether judges working
independently are apt to reach similar conclusions regarding the oral
proficiency of nonnative speakers of English; (2) whether the ratings of one
pair of judges are similar to the ratings of other pairs of judges; (3) whether
four subscales aimed at supposedly different aspects of proficiency actually
contribute nonredundant information. Ninety-eight nonnative speakers of
English were divided into six groups. Five interviewers were paired to form
six teams. Each group of subjects was rated on five scales (listening
comprehension, pronunciation, fluency, grammar, and overall proficiency)
by one of the six teams of interviewers. The results show that ratings
assigned by one judge in a pair are significantly different from ratings
assigned by the other in some cases. However, the reliability coefficients are
above .70 for all but two pairs of judges. Hence an average of the ratings by
two judges is a better measure of proficiency than a rating by a single judge.
The results also show that on all scales except grammar, the six groups were
rated similarly. The four subscales were given approximately equal weight
and all taken together (rather than singly or in pairs or triplets) best predict
overall oral proficiency. The overall scale shows a high reliability across all
groups, and since it appears to be a composite of the four subscales, it is
considered the best measure of oral proficiency.




Testing speaking proficiency has been of special interest in the field of
second language learning and it has generally been recognized that the best way to
test for oral proficiency is to have a subject speak. However, the issue of what and
how to test has remained. Five basic components of speaking skill have been
proposed by Harris (1969): pronunciation, grammar, vocabulary, fluency, and


